ABSTRACT: In an attempt to understand what distinguishes severe acute respiratory syndrome (SARS)
coronavirus (SCoV) from other members of the coronaviridae, we searched for elements that are unique 
to its proteins and not present in any other family member. We identified an insertion of two glycine 
residues, forming the GxxxG motif, in the SCoV spike protein transmembrane domain (TMD), which is 
not found in any other coronavirus. This surprising finding raises an “oligomerization riddle”: the GxxxG 
motif is a known dimerization signal, while the SCoV spike protein is known to be trimeric. Using an in 
vivo assay, we found that the SCoV spike protein TMD is oligomeric and that this oligomerization is 
driven by the GxxxG motif. We also found that the GxxxG motif contributes toward the trimerization of
the entire spike protein: mutations in the GxxxG motif decrease trimerization of the full-length
protein expressed in mammalian cells. Using molecular modeling, we show that the SCoV spike protein
TMD adopts a distinct structure that is not found in any other coronavirus. In this unique structure,
the glycine residues of the GxxxG motif face each other, enhancing helix-helix interactions by
allowing for the close positioning of the helices. This unique orientation of the glycine residues also
stabilizes the trimeric bundle during multi-nanosecond molecular dynamics simulation in a hydrated lipid
bilayer. To the best of our knowledge, this is the first demonstration that the GxxxG motif can potentiate
oligomeric forms other than a dimer. Finally, according to recent studies, the stabilization of the trimeric
bundle is linked to a higher fusion activity of the spike protein, and the possible influence of the GxxxG
motif on this feature is discussed.


A previously unidentified member of the coronaviridae
has been shown to be the etiologic agent responsible for the
recent severe acute respiratory syndrome (SARS)¹ outbreak
(1). Viral genome sequencing, in combination with protein
phylogenetic analyses, has shown SARS coronavirus (SCoV)
to belong to a new subfamily within the coronaviridae (2,
3).

As enveloped viruses, coronaviridae contain at least three 
membrane proteins: a small membrane protein (E), an 
integral membrane protein (M), and the spike protein (S or 
E2), a class I viral fusion protein (4). In an attempt to 
understand some of the features that distinguish SCoV from 
other coronaviridae, we performed detailed sequence comparisons
of each of the above membrane proteins with
those found in other coronaviruses. Interestingly, we have
found that the transmembrane domain (TMD) of the SARS
spike protein contains an insertion of two glycine residues
forming the known GxxxG motif, a unique feature of the
SCoV that is the subject of this work.

* This work was supported in part by a grant from the Israel Science
Foundation (784/01) to I.T.A.

* To whom correspondence should be addressed. E-mail:
arkin@cc.huji.ac.il. Telephone/Fax: +972-2-658-4329.

¹ Abbreviations: AMP, ampicillin; CAT, chloramphenicol acetyl
transferase; CNS, crystallography and NMR system; DMPC,
dimyristoylphosphocholine; MBP, maltose-binding protein; MD, molecular
dynamics; PAGE, polyacrylamide gel electrophoresis; rmsd,
root-mean-square deviation; SARS, severe acute respiratory syndrome; SCoV,
SARS coronavirus; SDS, sodium dodecyl sulfate; TLC, thin-layer
chromatography; TMD, transmembrane domain.

10.1021/bi060953v CCC: $33.50

Coronavirus spike proteins are large, multifunctional,
homo-oligomeric proteins (5–9), forming petal-shaped spikes
on the virus envelope. It is this morphological feature from
which coronaviruses derive their name (10). These type-I
membrane proteins are responsible for attachment and entry
of the virus into the target cell membrane, thereby determining
tissue tropism and host specificity (11–13). The spike
protein is also the predominant antigenic determinant of the
virus (10, 14).

Structurally, the SCoV spike protein contains 1255 residues
and, by analogy to other coronavirus spike proteins, is
divided into two functional domains, S1 and S2, comprising
the N- and C-terminal halves, respectively. S1 is responsible
for binding to cellular receptors, thereby determining host
range, while the membrane-anchored S2 is important for viral
entry into cells. Unlike other class I fusion proteins, the SCoV
S protein is not post-translationally cleaved into the S1 and
S2 domains (8). Instead, it was suggested that, after
binding to cellular receptors, the SCoV spike protein
undergoes proteolysis within endosomes (15).

The oligomeric nature of other coronavirus spike proteins
was initially shown to be homotrimeric (5), although other
reports in the literature have shown that the S1 domain forms
dimers (6). More recent studies on the SCoV spike protein
have clearly shown it to be homotrimeric (8), but there is
some evidence that part of the soluble S1 domain forms
dimers (7). Moreover, it was shown that trimerization is a
function of 60 amino acids at the C-terminal region of the
protein, encompassing the cytoplasmic tail and the TMD (8).
Herein, we are able to explicitly delineate the structural
elements that potentiate this trimerization reaction.

Table 1: Transmembrane Sequences Used in the TOXCAT System (24)^a

plasmid name    sequence                      residues
GPAwt           LIIFGVMAGVIGTIL               75–89
G83I            LIIFGVMAIVIGTIL               75–89
pccKan          no TM
SARSwt          YIKWPWYVWLGFIAGLIAIVMVTILL    1191–1216
G1201I          YIKWPWYVWLIFIAGLIAIVMVTILL    1191–1216
G1201,1205I     YIKWPWYVWLIFIAILIAIVMVTILL    1191–1216
HCVwt           YVKWPWYVWLLICLAGVAMLVLLFFI    1293–1318
MHVwt           YVKWPWYVWLLIGLAGVAVCVLLFFI    1313–1338
G83I+           YIKWPWYVWLLIIFGVMAIVIGTIL     SARS(1191–1200) + GPA(75–89)
G83I+ran        PYITYLVWWKWLIIFGVMAIVIGTIL

^a The indicated sequences were cloned between the ToxR cytoplasmic DNA-binding domains and MBP, to create the ToxR(TM)MBP chimera,
used for measuring the relative strength of oligomerization in the E. coli inner membrane (24).

In relation to other proteins in the virus, the spike protein
(mostly the S1 segment) is not as conserved: 24% identity
to other coronaviruses, versus an average 39% identity for
all SCoV proteins (2, 3). However, it does contain a highly
conserved juxtamembranous sequence element in S2, part
of the known Pfam motif PF01601. It is next to this sequence
element that the unique GxxxG motif was
identified.

The GxxxG motif is a known transmembrane dimerization
signal (16–21). The gap of three amino acids between the
glycine residues in an α-helical structure aligns the two
glycines on the same face of the helix. This alignment creates
a flat platform, with which the two glycine residues from
another helix can associate to form a symmetric homodimer.
In this close positioning, the glycine residues can
donate their Cα hydrogen to form a hydrogen bond with a
carbonyl oxygen atom from an adjacent helix (22). Such a
Cα–H···O hydrogen bond can stabilize the helix-helix
interactions by 0.88 kcal/mol (23).
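
The geometric argument above can be illustrated with a short calculation. The sketch below assumes an ideal α helix with 3.6 residues per turn (the value used for the canonical helix in the modeling section) and reports the angular separation between residues i and i+n when viewed down the helix axis. The function name is ours and is purely illustrative.

```python
# Angular position of residues around an ideal alpha helix.
# Assumption: 3.6 residues per turn, i.e. about 100 degrees per residue.
RESIDUES_PER_TURN = 3.6
DEGREES_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN


def angular_offset(spacing: int) -> float:
    """Smallest angle (degrees) between residues i and i+spacing,
    measured around the helix axis."""
    angle = (spacing * DEGREES_PER_RESIDUE) % 360.0
    return min(angle, 360.0 - angle)


if __name__ == "__main__":
    for spacing in range(1, 6):
        print(f"i, i+{spacing}: {angular_offset(spacing):.0f} degrees apart")
```

Under this assumption, two glycines separated by three residues (i and i+4) lie roughly 40° apart and about one helical turn apart along the axis, which is consistent with both residues facing the same side of the helix.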

Here, we report the influence of the GxxxG motif upon 
the oligomerization of the SCoV spike protein TMD and 
upon the trimerization of the full-length protein. We also 
present a structural model, demonstrating the effect of this 
motif on the trimeric structure of the SCoV spike protein 
TMD, compared to other coronaviruses. As opposed to all 
other cases reported in the literature, this is the first example 
of the involvement of the GxxxG motif in a trimerization 
reaction. 


MATERIALS AND METHODS 


Plasmid Construction. Molecular cloning was carried out
using standard procedures (36). The plasmids for the ToxR
experiments were cloned as follows: the sense
oligonucleotide corresponding to each sequence described in
Table 1 was synthesized including 5′ NheI and 3′ BamHI
restriction sites. Double-stranded inserts were synthesized
using Taq polymerase, after hybridizing a corresponding
antisense 30-mer to the 3′ extension of the sense
oligonucleotide. The resulting purified double-stranded products
were digested with NheI and BamHI and gel-purified.
The expression vector pccKAN (a kind gift of D. M.
Engelman) (24) was digested with NheI and BamHI and
isolated by gel purification. The inserts were ligated in-frame
into the digested vector using phage T4 DNA ligase, at 16
°C for about 18 h. The resulting plasmids, encoding the
ToxR(TM)maltose-binding protein (MBP) chimera, were
transformed into Escherichia coli DH5α competent cells and
plated on Luria-Bertani (LB) agar containing 200 µg/mL
ampicillin (AMP). All assays were performed on transformed
E. coli NT326 cells (a kind gift of D. M. Engelman)
expressing the ToxR(TM)MBP chimera. This strain lacks
the endogenous MBP (malE⁻), resulting in its inability to
transport maltose into the cytoplasm.


Maltose Complementation Assay. NT326 cells expressing
the ToxR(TM)MBP chimera were cultured on M9 minimal
medium plates containing 0.4% maltose as the only carbon
source, 1% Bacto agar (Difco, MD), and 100 µg/mL AMP,
for 3 days at 37 °C. Transformed cells expressing a chimera
not containing a TMD (pccKan) were used as a control. The
malE⁻ NT326 strain is unable to grow on media in which
the only carbon source is maltose. Proper orientation of the
ToxR(TM)MBP protein in the inner membrane will localize
the MBP portion in the periplasm, allowing it to complement
the NT326 malE⁻ phenotype and support growth on maltose
as the only carbon source. As seen in Figure 2, all constructs
enabled survival on M9-maltose medium, indicating
correct membrane insertion and orientation. As expected,
control cells expressing a construct with no TMD (pccKan)
were unable to grow on M9-maltose media.


Chloramphenicol Acetyl Transferase (CAT) Assay. In a
typical experiment, competent NT326 cells were transformed
with the appropriate plasmid and plated on LB agar plates
containing 200 µg/mL AMP. Three different colonies were
incubated overnight at 37 °C in LB medium containing 200
µg/mL AMP, diluted 100-fold into fresh medium, and
regrown at 37 °C to mid-logarithmic phase. A total of 200
µL of cell culture normalized to an OD600 of 0.6 was pelleted
and resuspended in 500 µL of 100 mM Tris-HCl (pH 8.0).
Lysis was achieved by the addition of 20 µL of lysis buffer
[100 mM dithiothreitol (DTT), 100 mM
ethylenediaminetetraacetic acid (EDTA), and 50 mM Tris-HCl at pH 8.0]
and 40 µL of toluene followed by incubation at 30 °C for
30 min. A total of 5 µL of cell lysate was diluted 1:10 with
100 mM Tris-HCl at pH 8.0, and CAT activity was measured
with 10 µL of the diluted cell lysate, in the presence of 4
mM/h acetyl CoA and [14C]chloramphenicol (Amersham
Biosciences, U.K., 0.05 µCi/assay). Reactions proceeded by
incubation at 37 °C for 20 min and were terminated by the
addition of 200 µL of 250 mM Tris-HCl at pH 7.5 and
extraction with 250 µL of ethyl acetate. Except during the
incubations at 37 and 30 °C, reaction mixtures were
always kept on ice. [14C]chloramphenicol and its acetylated
products were separated by thin-layer chromatography (TLC)
on silica-gel plates (60 F254, Merck, Germany) using a solvent
system composed of chloroform/methanol (95:5, v/v). The
amount of radioactivity in the substrate and its acetylated
products was quantified using a phosphorimager (Fuji
Bio-Imaging analyzer, BAS-100). Results are represented as the
percentage of [14C]chloramphenicol converted to the
acetylated products.
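
The final quantity reported by the assay, the percentage of substrate converted to acetylated products, can be sketched as follows. This is a minimal illustration, assuming that the phosphorimager signal of each TLC spot is proportional to the amount of radiolabel in it. The function name and the example counts are hypothetical and are not taken from this work.

```python
from typing import Sequence


def percent_acetylated(substrate_counts: float,
                       product_counts: Sequence[float]) -> float:
    """Percentage of the total [14C]chloramphenicol signal that is found
    in the acetylated product spots of one TLC lane."""
    acetylated = sum(product_counts)
    total = substrate_counts + acetylated
    if total <= 0:
        raise ValueError("total signal must be positive")
    return 100.0 * acetylated / total


if __name__ == "__main__":
    # Hypothetical spot intensities (arbitrary units) for a single lane.
    print(percent_acetylated(750.0, [200.0, 50.0]))
```

Expressing the result as a ratio within each lane makes the measurement insensitive to small differences in the amount of material loaded on the plate.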

Expression in Tissue-Culture Cells. The plasmid expressing
full-length SCoV spike protein was a kind gift of Prof.
G. J. Nabel from the National Institutes of Health (13). The
sequence is according to accession number AY278741 (strain
Urbani), and the protein is expressed under the
cytomegalovirus (CMV) promoter/enhancer. The plasmid encoding the
double-mutated G1201,1205I S protein was generated using
the QuikChange XL kit (Stratagene) from the plasmid
encoding the wild-type (wt) sequence. HEK 293T cells were
maintained in Dulbecco’s modified Eagle’s medium (Beit
Haemek) containing penicillin (100 units/mL), streptomycin
(0.1 mg/mL), and amphotericin (0.25 mg/mL) and
supplemented with 10% fetal bovine serum. The cells were
transfected by using Escort V Transfection Reagent (Sigma,
St. Louis, MO) according to the instructions of the
manufacturer. At 40 h post-transfection, the cells were
washed twice with ice-cold phosphate-buffered saline (PBS)
and lysed with 1× lysis buffer (50 mM Tris-HCl at pH 7.4,
150 mM NaCl, 1 mM EDTA, and 1% Triton X-100)
containing complete mini protease inhibitor (Roche,
Indianapolis, IN). After a 30 min incubation on ice, cell debris
was cleared by centrifugation. The cleared lysate was used
for Western blot analysis.

The cell lysate was diluted with sample buffer [60 mM
Tris-HCl at pH 6.8, 100 mM DTT, 3% sodium dodecyl
sulfate (SDS), 10% glycerol, and 0.01% bromophenol blue]
and heated for 5 min at either 80 or 100 °C, as indicated.
Cell lysates were resolved by SDS-4% polyacrylamide gel
electrophoresis (SDS-4% PAGE) and transferred to a
nitrocellulose membrane. The membrane was blocked with
blocking buffer (5% bovine serum albumin and 0.2% Tween
20 in PBS) for 1 h at room temperature. The blot was further
incubated overnight at 4 °C with mouse anti-SARS S protein
monoclonal antibody (Zymed, San Francisco, CA) diluted
1:300 in blocking buffer. After the membrane was washed
4 times with washing buffer (0.2% Tween 20 in PBS), it
was incubated with horseradish-peroxidase-conjugated goat
anti-mouse antibody (Jackson, Baltimore, PA) for 1 h at room
temperature and washed. Detection was performed with ECL
reagent (Beit Haemek, Israel) and quantified using Intelligent
Dark Box II (LAS-1000, Fujifilm, Japan) and Image Gauge
software (Fujifilm, Japan).

Molecular Modeling. A global search, in which the helices
were rotated around their axes, was carried out as
described in detail elsewhere (28). Briefly, all calculations
were performed using the crystallography and NMR
system (CNS, version 1.1) (37) assuming initial symmetrical
interaction between the helices in the homo-oligomer.
The OPLS parameter set with a united atom topology
was used, representing explicitly polar and aromatic
hydrogens (38). Calculations were carried out using the
following sequences, corresponding to the predicted TMD of
the spike protein (see Figure 6): SARS strain Urbani
(P59594), residues W1194–L1216; bovine coronavirus
(BCoV) strain LY-138 (P25192), residues W1306–I1328;
murine hepatitis virus (MHV) strain 4 (P22432),
residues W1316–I1338; human coronavirus (HCoV1) strain
OC43 (P36334), residues W1296–I1318; human
coronavirus (HCoV2) strain 229E (P15423), residues
W1114–C1136; avian infectious bronchitis virus (AIBV1) strain
6/82 (P05135), residues W1093–I1115; avian infectious
bronchitis virus (AIBV2) strain KB8523 (P12650),
residues W1092–V1114; porcine respiratory coronavirus
(PRCoV) strain 86/137004 (P27655), residues
W1165–C1187; porcine epidemic diarrhea virus (PEDV) strain
CV777 (NP_598310), residues W1323–C1345; feline
infectious peritonitis virus (FIPV) strain 79-1146 (P10033),
residues W1392–C1414.

All calculations were carried out in vacuo with the initial
coordinates of a canonical α helix (3.6 residues/turn).
Symmetric trimers were generated from the various sequences
by duplicating and rotating the helix by 360°/3. An
initial crossing angle of 25° for left-handed and −25° for
right-handed structures was introduced by rotating the long
helix axis with respect to the long bundle axis. The
symmetrical search was carried out by rotating all of the
helices simultaneously between 0° and 360° in 10° steps.
Four trials were carried out for each starting structure, using
different initial random atom velocities. This procedure
results in a total of 288 different structures (i.e., 36 × 4 ×
2). Each structure was subjected to a simulated-annealing
and energy-minimization protocol.
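
The bookkeeping of the search grid can be made explicit with a short sketch. The code below only enumerates the starting configurations described above (36 rotation angles, two crossing angles, four trials) and does not perform any modeling. It is not the CNS protocol itself, and the dictionary keys are illustrative names of our own.

```python
from itertools import product

ROTATION_STEP_DEG = 10
CROSSING_ANGLES_DEG = (25, -25)  # left-handed and right-handed bundles
TRIALS = 4                       # different initial random velocities


def starting_configurations():
    """Enumerate every (rotation, crossing angle, trial) combination."""
    rotations = range(0, 360, ROTATION_STEP_DEG)  # 0, 10, ..., 350
    return [
        {"rotation_deg": rot, "crossing_deg": cross, "trial": trial}
        for rot, cross, trial in product(
            rotations, CROSSING_ANGLES_DEG, range(TRIALS)
        )
    ]


if __name__ == "__main__":
    configs = starting_configurations()
    print(len(configs))  # 36 * 2 * 4 = 288
```

Because a rotation of 360° is equivalent to 0°, the 10° grid yields 36 distinct rotation angles rather than 37.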

The resulting structures were grouped into clusters. A cluster
was defined as containing more than 10 similar structures, with the Cα
root-mean-square deviation (rmsd) of the coordinates
between every two structures within the cluster no larger
than 1.0 Å. For each cluster, an average structure was
calculated, energy-minimized, and subjected to a
simulated-annealing protocol identical to that used in the systematic
search. This average structure was taken as a representative
of each cluster.
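
The cluster criterion can be sketched in a few lines. The example below is illustrative only: it assumes that each structure is given as a list of Cα coordinates in Å and that the structures have already been superimposed, so no fitting step is included. The function names and default arguments are ours.

```python
import math
from itertools import combinations


def ca_rmsd(a, b):
    """rmsd (in Å) between two equal-length lists of (x, y, z) Cα
    coordinates. Assumes the two structures are already superimposed."""
    if len(a) != len(b) or not a:
        raise ValueError("structures must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def is_cluster(structures, max_rmsd=1.0, min_members=11):
    """True if the set has more than 10 members and every pair of
    members is within max_rmsd of each other."""
    if len(structures) < min_members:
        return False
    return all(
        ca_rmsd(a, b) <= max_rmsd for a, b in combinations(structures, 2)
    )
```

Requiring every pair to satisfy the cutoff, rather than comparing each member to a single reference, keeps the definition independent of which structure is chosen as the reference.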

Hydrated Lipid Simulations. The simulation presented
herein was conducted using version 3.2.1 of the GROMACS
(39) molecular dynamics (MD) simulation package
(www.gromacs.org). An extended version of the GROMOS96 force
field (40), implemented in GROMACS, was used.
Dimyristoylphosphocholine (DMPC) force-field parameters were as
described elsewhere in detail (41). In all simulations, the
LINCS algorithm (42) was used to constrain bond lengths.
A simulation time step of 2 fs was used, and atomic
coordinates were saved every 0.5 ps. The simulation was
conducted at a constant temperature of 310 K. Solvent, lipid,
and protein were coupled separately to a Berendsen
temperature bath (43), with a coupling constant of τ_T = 0.1 ps.
The pressure was kept constant by coupling the system to a
Berendsen pressure bath (43) of 1 bar, with a coupling
constant of τ_P = 3 ps. The membrane plane (xy) and
membrane normal (z) directions were separately coupled. A
cutoff of 1 nm was used for van der Waals interactions.
Electrostatic interactions were computed using the PME
algorithm (44), with a 1 nm cutoff for the direct space
calculation.
FIGURE 1: Multiple sequence alignment of the putative spike protein
TMD of 10 different coronaviruses. The SCoV spike protein TMD 
corresponds to residues Y1194—L1216. The accession numbers for 
each of the sequences are given in the Materials and Methods. The 
coloring is according to the chemical nature of the amino acid. 
The arrows indicate the insertion of the glycine residues in the 
SCoV spike protein TMD. 


The membrane environment in which the proteins exist
was mimicked using a lipid bilayer initially made of 128
DMPC lipids (downloaded from the site of Prof. D. P. Tieleman;
http://moose.bio.ucalgary.ca/) embedded in 3655 molecules
of SPC water (45). To set up the protein-membrane systems,
a hole was generated in the bilayer using the
solvent-accessible surface of each protein as a template. This methodology
has been described in more detail elsewhere (46).

The protein-bilayer-solvent system was energy-minimized 
followed by equilibration in two stages: (i) 0.5 ns of MD 
run with positional restraints on the protein and DMPC atoms 
and (ii) 0.5 ns of MD run with positional restraints on the 
protein atoms, to improve the packing of lipids around the 
protein. Thereafter, the equilibrated system was subjected 
to 7.5 ns of MD runs with hydrogen-containing bonds 
constrained. The simulation was set up and performed on a 
Dual 2 GHz G5 workstation (Apple, Inc.). The time for one 
dual CPU machine was less than 21 h/ns of simulation. 
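
For orientation, the simulation parameters quoted above translate into the following step counts, trajectory sizes, and wall-clock bound. The sketch below is simple arithmetic on the stated values (2 fs time step, coordinates saved every 0.5 ps, 7.5 ns of production, less than 21 h/ns). The function names are ours and are not part of GROMACS.

```python
TIME_STEP_FS = 2.0       # integration time step
SAVE_INTERVAL_PS = 0.5   # interval between saved coordinate frames
PRODUCTION_NS = 7.5      # length of the production run
MAX_HOURS_PER_NS = 21.0  # reported upper bound on wall-clock cost


def n_steps(duration_ns, time_step_fs=TIME_STEP_FS):
    """Number of integration steps needed to cover duration_ns."""
    return round(duration_ns * 1e6 / time_step_fs)


def n_frames(duration_ns, save_interval_ps=SAVE_INTERVAL_PS):
    """Number of coordinate frames written during duration_ns."""
    return round(duration_ns * 1e3 / save_interval_ps)


def max_wall_hours(duration_ns, hours_per_ns=MAX_HOURS_PER_NS):
    """Upper bound on wall-clock time, in hours."""
    return duration_ns * hours_per_ns


if __name__ == "__main__":
    print(n_steps(PRODUCTION_NS))         # 3750000 steps
    print(n_frames(PRODUCTION_NS))        # 15000 frames
    print(max_wall_hours(PRODUCTION_NS))  # 157.5 h, i.e. under a week
```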


RESULTS AND DISCUSSION 


In this work, we ask whether the GxxxG motif of
the SCoV TMD is involved in oligomeric structures other
than dimers and what the structural outcome of this
involvement is. The contradiction between the classical role
of the GxxxG motif as a dimerization signal and the trimeric
structure of the SCoV spike protein is more intriguing when
considering the fact that the TMD and carboxy terminus of
the spike protein play a role in stabilizing the trimeric
structure of the protein (7–9).

SCoV S Contains an Insertion of Two Glycine Residues 
in Its TMD. Sequence alignments of the coronavirus spike 
proteins revealed that, unique to the SARS virus, two glycine 
residues are inserted into the membrane segment of the spike 
protein. Moreover, the insertion is adjacent to a highly
conserved sequence segment, as shown in Figure 1. The exact
spacing of three amino acids between the two glycine
residues identifies the insertion as the GxxxG motif, a known
transmembrane dimerization signal (16–21).
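
The presence or absence of the motif in the transmembrane sequences of Table 1 can be checked with a simple pattern search. The sketch below is illustrative and is not the alignment procedure used in this work. It scans each sequence for a glycine followed, four positions later, by another glycine, and reports the positions using the residue numbering of the first residue given in Table 1.

```python
import re

# Sequences and first-residue numbers as listed in Table 1.
TMD_SEQUENCES = {
    "GPAwt": ("LIIFGVMAGVIGTIL", 75),
    "G83I": ("LIIFGVMAIVIGTIL", 75),
    "SARSwt": ("YIKWPWYVWLGFIAGLIAIVMVTILL", 1191),
    "G1201I": ("YIKWPWYVWLIFIAGLIAIVMVTILL", 1191),
    "HCVwt": ("YVKWPWYVWLLICLAGVAMLVLLFFI", 1293),
    "MHVwt": ("YVKWPWYVWLLIGLAGVAVCVLLFFI", 1313),
}

# A lookahead is used so that overlapping motifs would also be reported.
GXXXG = re.compile(r"G(?=.{3}G)")


def find_gxxxg(sequence, first_residue):
    """Return (i, i+4) residue numbers for every GxxxG occurrence."""
    return [
        (m.start() + first_residue, m.start() + first_residue + 4)
        for m in GXXXG.finditer(sequence)
    ]


if __name__ == "__main__":
    for name, (seq, start) in TMD_SEQUENCES.items():
        print(name, find_gxxxg(seq, start))
```

For the sequences above, the search finds the motif at G1201/G1205 in SARSwt and at G79/G83 in GPAwt, and finds no GxxxG in the mutants or in the HCVwt and MHVwt segments.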

The identification of an inserted GxxxG motif in the SCoV
spike protein TMD poses an immediate “oligomerization
riddle”: the GxxxG motif is a known dimerization signal
(16, 17), while the SCoV spike protein is known to be
trimeric (8). Other questions that come up from this finding
are what, if any, is the role of the SCoV spike protein TMD
in the trimerization reaction and what is the significance of
the unique insertion of the GxxxG motif on the biology of
SCoV? Below, we attempt to experimentally address the first
two questions and discuss the third.

Arbely et al.

SCoV Spike Protein TMD Oligomerization in Bacterial
Membranes. The effects of inserting two glycine residues,
forming the GxxxG motif, upon the oligomerization of the
SCoV spike protein TMD, were investigated experimentally
using the ToxR/TOXCAT system (18, 24). In this system,
the TMD of interest is fused between the cytoplasmic domain
of the Vibrio cholerae transcriptional activator (ToxR) and
the periplasmic domain of the MBP. Oligomerization of the
resulting chimera, driven by the TMD, results in transcription
of the reporter gene, CAT, the relative concentration of which
can be quantitated by a simple CAT assay.

The different transmembrane segments analyzed in this
study by the TOXCAT assay are indicated in Table 1. A
monomeric mutant of human glycophorin A (G83I) (24) was
used as a control for basal CAT activity arising even without
transmembrane oligomerization. The homo-oligomerization
of wild-type SARS (SARSwt) was compared to the singly
mutated G1201I and doubly mutated G1201,1205I mutants.
Wild-type human coronavirus (HCVwt, strain OC43) and
wild-type murine hepatitis virus (MHVwt, strain 4) were
chosen as representatives of other coronaviruses. The plasmid
G83I+ was constructed from the TMD of glycophorin A
containing the mutation G83I (24) with the addition of the
10 highly conserved residues (1191–1200) of the SCoV
spike protein, to investigate the possible role of that
conserved sequence in oligomerization. The same 10 conserved
residues were also added in random order to the TMD
of G83I, to create the G83I+ran, which served as a control.

Orientation and localization of the various transmembrane
domains were confirmed by a maltose complementation
assay. Indicated proteins were expressed in the NT326 strain
that lacks the endogenous MBP (malE⁻), resulting in its
inability to grow on media in which the only carbon source
is maltose. Proper orientation of the ToxR(TM)MBP protein
in the inner membrane will localize the MBP portion in the
periplasm, allowing it to complement the NT326 malE⁻
phenotype and support growth when maltose is the only
carbon source. As seen in Figure 2, all constructs
allowed survival on M9-maltose medium, indicating
correct membrane insertion and orientation. As expected,
cells expressing a construct with no TMD (pccKan) were
unable to grow on M9-maltose media and served as a
negative control.

As shown in Figure 3, cells expressing a chimera contain- 
ing the SARSwt TMD exhibited significantly higher levels 
of CAT activity than did cells containing the G83I plasmid, 
which was taken to represent basal CAT activity (see above). 
The results indicate for the first time that the TMD of the 
SCoV spike protein is capable of oligomerization. 

The extent of SCoV spike protein TMD oligomerization
was lower than that found for the TMD of human glycophorin
A, a highly dimerizing element that retains oligomerization
even in the presence of SDS micelles (25). However,
the oligomerization observed for the SARS spike protein
TMD is similar to or larger than that found in other proteins
such as the TMD of the ErbB receptor (26). Moreover, recent
studies have shown that CAT activity, taken to represent
oligomerization in the TOXCAT assay (24), is dependent
upon the length of the transmembrane segment (27). Therefore,
comparisons should only be made between transmembrane
sequences of equal length. For this reason, the level
of wild-type SCoV TMD oligomerization was set as 100%
for all future comparisons.

FIGURE 2: malE⁻ complementation assay to test for proper insertion
and orientation of the ToxR(TM)MBP chimeric proteins. NT326
cells expressing the indicated chimera were cultured on M9-maltose
plates. Survival of the cells indicates that the chimera is correctly
inserted into the membrane, as can be seen for all of the constructs.
Cells expressing the pccKan plasmid, which contains no TMD,
served as a control.

FIGURE 3: Quantitative CAT assay of transmembrane
homo-oligomerization employing the TOXCAT system (24). CAT assays
were performed on normalized quantities of lysates. Bars represent
the mean CAT activity ± standard deviation of at least three
independent measurements, after subtraction of the basal CAT
activity of the G83I mutant and relative to SARSwt.
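
The normalization used for these comparisons (subtraction of the basal G83I activity, then scaling so that SARSwt equals 100%) can be written out explicitly. The sketch below is a minimal illustration. The function name and the numerical values in the example are hypothetical and are not measurements from this work.

```python
def relative_oligomerization(cat_activity, basal, reference):
    """CAT activity as a percentage of the reference construct,
    after subtracting the basal (monomeric control) activity."""
    span = reference - basal
    if span <= 0:
        raise ValueError("reference activity must exceed basal activity")
    return 100.0 * (cat_activity - basal) / span


if __name__ == "__main__":
    # Hypothetical percent-conversion values for three constructs.
    basal_g83i = 10.0
    sars_wt = 60.0
    mutant = 46.0
    print(relative_oligomerization(sars_wt, basal_g83i, sars_wt))  # 100.0
    print(relative_oligomerization(mutant, basal_g83i, sars_wt))   # 72.0
```

On this scale the monomeric control is 0% and the wild-type SCoV TMD is 100% by construction, so a value of 72% corresponds to a 28% loss of apparent oligomerization.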

Importance of the GxxxG Motif. To ascertain the importance
of the GxxxG motif in the oligomerization of the SCoV
spike protein TMD, the CAT activity of cells containing the
G1201I chimera was measured. The results indicated
a 28% loss of oligomerization relative to SARSwt upon
mutation of a single glycine in the motif. Furthermore,
mutation of the second glycine (G1201I + G1205I) reduced
the apparent oligomerization by a further 33% (total reduction
of 61%). Thus, it can be concluded that not only is the SCoV
spike protein TMD oligomeric but also that the uniquely inserted
GxxxG motif plays a pivotal role in the oligomerization.

Lack of Influence of the Juxtamembranous Part of the
PF01601 Motif upon Oligomerization. The influence of the
highly conserved juxtamembranous sequence element (part
of the Pfam PF01601 motif) upon oligomerization was
analyzed as well, using a chimera in which it was added to
the G83I monomeric mutant of glycophorin A (representing
the oligomerization basal level). The oligomerization of this
construct was similar to that of the G1201,1205I mutant and
smaller than SARSwt TMD by 46%. Most importantly, no
significant difference was seen upon the addition of a
randomized sequence of the aforementioned sequence element.
These results indicate that the difference in the
oligomerization state between the chimeric proteins and the
G83I mutant is length-dependent and not sequence-dependent.
Thus, the highly conserved juxtamembranous sequence
element does not seem to contribute significantly toward
oligomerization of the TMD of the SCoV spike protein.

FIGURE 4: Western blot SDS-PAGE of the SCoV wild-type protein
and G1201,1205I mutant expressed in HEK293T cells. Anti-SCoV
spike protein antibodies were used to detect the presence of the
protein. Samples were heated to either 80 or 100 °C prior to
electrophoresis as indicated. The sizes of the molecular-weight
markers as well as the likely positions of the spike protein monomer
and trimer are indicated.

Oligomerization of the TMD of the Human Coronavirus 
Spike Protein. Interestingly, despite the fact that it does not 
contain a GxxxG motif, the TMD of the human coronavirus 
(HCVwt) spike protein also seems to be oligomeric to an 
extent similar to wild-type SCoV TMD (see lane 4 in Figure 
3). The murine hepatitis virus (MHVwt) spike protein on 
the other hand is about 70% oligomeric relative to SARS 
(lane 5 in Figure 3). It may be possible to gain insight into 
this second contribution toward oligomerization in coronavi- 
ruses by noticing that only three amino acids differ between 
HCoV1 and MHV (Figure 1). One obvious possibility is that 
C1305 in HCoV participates in an interhelical disulfide bond, 
not found in MHV. However, it is clear that even without 
the GxxxG motif, transmembrane segments of other coro- 
navirus spike proteins are able to oligomerize, albeit differ- 
ently from the SCoV spike protein. 

Importance of the GxxxG Motif in Trimerization of the 
SCoV Spike Protein Expressed by Mammalian Cells. As 
stated above, all data in the literature regarding the GxxxG 
motif pertain to its involvement in dimerization processes. 
However, full-length SCoV spike protein is known to be 
trimeric (8, 9). Because the ToxR/TOXCAT system (18, 24)
cannot distinguish between the different oligomeric forms, 
we have expressed the full-length SCoV spike protein in 
tissue-culture cells to reconcile this conundrum. In addition, 
such experiments will also enable us to gauge the importance 
of the TMD in the context of the entire spike protein. 

Toward this end, we have transiently expressed SCoV
wild-type and mutant proteins in HEK 293T cells. As shown
in Figure 4, SDS-PAGE of the full-length SCoV spike
protein shows that the protein migrates in the gel as two
species with apparent molecular weights of 510 and 170 kDa. These
species correspond to the molecular weights of the trimeric
and monomeric forms of the protein, respectively.
Trimerization persists even when heating the samples in 3% SDS
sample buffer to 80 or 100 °C prior to electrophoresis.
Densitometric quantification indicates that the wild-type
protein is 31–35% more trimeric than the mutant in which
both glycines were replaced by isoleucine.
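
The assignment of the two bands and a possible way to express the degree of trimerization from densitometry can be sketched as follows. This is an assumption-laden illustration: the source does not state how the percentage difference was computed, so the trimer fraction defined here (trimer signal divided by total signal in the lane) is only one plausible measure, and the band intensities are hypothetical.

```python
MONOMER_KDA = 170  # apparent mass of the monomeric spike protein band


def oligomer_mass(n_subunits, monomer_kda=MONOMER_KDA):
    """Expected apparent mass (kDa) of an n-subunit homo-oligomer."""
    return n_subunits * monomer_kda


def trimer_fraction(trimer_intensity, monomer_intensity):
    """Fraction of the lane signal found in the trimer band.
    Assumed definition, not necessarily the one used in this work."""
    total = trimer_intensity + monomer_intensity
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return trimer_intensity / total


if __name__ == "__main__":
    print(oligomer_mass(3))           # 510, matching the upper band
    print(trimer_fraction(3.0, 1.0))  # 0.75 for hypothetical intensities
```

A dimer of the same subunit would be expected near 340 kDa, which is why the 510 kDa species is assigned to the trimer.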

Thus, not only does the GxxxG motif contribute toward 
oligomerization of the isolated TMD of the protein, it also 
promotes the trimerization of the protein in the context of 
the entire protein. This analysis allows us to unequivocally 
state that the GxxxG motif in the SCoV spike protein 
potentiates trimerization. To the best of our knowledge, this 
is the first experimental proof that a GxxxG motif participates 
in trimerization. Moreover, the GxxxG motif in SCoV 
represents the first transmembrane motif that is important 
for trimerization. 

Molecular Modeling of the SARS Spike Protein TMD. The
ability of the GxxxG motif to promote dimerization has been
well-established in the literature (for example, see refs
16–18 and 20). Specifically, it was postulated (22) and recently
shown experimentally (23) that the close apposition of
transmembrane helices, facilitated by the small size of the
glycine, promotes the formation of a Cα–H···O hydrogen
bond.

To better understand the implications of the insertion of a 
GxxxG motif upon the structure of the TMD of SCoV spike 
protein and what a trimeric GxxxG structure might look like, 
we turned to molecular modeling. Candidate models for the 
transmembrane helical bundles were derived using global- 
searching MD simulations (28). This procedure attempts to 
derive a model for a transmembrane helical bundle by 
exploring different starting configurations using MD. Specif- 
ically, 72 different starting configurations are used, whereby 
each structure was distinguished by the rotational angle of 
the helices (in increments of 10°) and by right- or left-handed 
crossing angles made between the helices (25°). The 
process is repeated 4 times, leading to a total number of 36 
x 2 x 4 = 288 structures sampled. When global-searching 
MD simulations are taken together, they yield several 
candidate structures for the corresponding transmembrane 
helical bundle. 
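
The count of starting configurations quoted above can be checked with a short enumeration. The Python sketch below only illustrates the bookkeeping and is not the search protocol of ref 28: the function and field names are hypothetical, and it assumes that the rotation angles run from 0° to 350° and that the four repeats are simply indexed 0 to 3.

```python
from itertools import product

ROTATION_STEP_DEG = 10               # rotational increment stated in the text
HANDEDNESS = ("left", "right")       # left- or right-handed crossing (25° each)
N_REPEATS = 4                        # the search is repeated 4 times


def starting_configurations():
    """Enumerate every (rotation, handedness, repeat) combination.

    Assumption: rotations cover 0..350 degrees in 10-degree steps,
    which gives the 36 orientations implied by the text.
    """
    rotations = range(0, 360, ROTATION_STEP_DEG)
    return [
        {"rotation_deg": rot, "handedness": hand, "repeat": rep}
        for rot, hand, rep in product(rotations, HANDEDNESS, range(N_REPEATS))
    ]


configs = starting_configurations()
# Distinct starting geometries, ignoring the repeat index
unique_geometries = {(c["rotation_deg"], c["handedness"]) for c in configs}

print(len(unique_geometries))  # 36 x 2 = 72
print(len(configs))            # 36 x 2 x 4 = 288
```

Counting the distinct geometries separately from the repeats makes it explicit that the 72 configurations and the 288 sampled structures differ only by the factor of 4 repeats.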

The process of global search MD was conducted for 
transmembrane segments of spike proteins from 10 different 
coronaviruses (Figure 1). This enabled us to locate structures 
that are found in all of the sequences (29) and those that are 
unique to one sequence. Close inspection of Figure 5 reveals 
a structure in SCoV that stands out from all other structures 
in any of the other viral variants. It appears at a rotational 
pitch angle of 113° and a crossing angle of −18° (see dotted 
circle). However, aside from its unique rotational and 
crossing angles, what distinguishes this structure from all 
of the rest is the close positioning of the helices shown by 
the color scale. 

Examination of this unique structure reveals that the 
GxxxG motif is located in the bundle center, as shown in 
Figure 6A. Thus, it may be possible to state that the insertion 
of the GxxxG motif changes the structure of the TMD of 
the SCoV spike protein in comparison to the structure 
adopted by spike proteins from all other coronaviruses. 

FIGURE 5: Results of the global MD searches (28) for 10 different 
coronavirus spike proteins. TMD trimers plotted as a function of 
their orientation. The color coding indicates the closest approach 
between three Cys in the different helices of the trimer, according 
to the color scale given at the top of the plot. The dotted gray circle 
indicates the unique SARS structure, not present in any of the other 
viruses. Note that it is also the structure in which the helices are 
the closest among all of the variants. The sequences of the different 
coronavirus spike protein TMDs are given in Figure 1. 


To refine the results obtained in the in vacuo modeling, 
we further analyzed the unique SARS TMD structure using 
multi-nanosecond MD simulations of the complex embedded 
in a fully hydrated phospholipid bilayer (Figure 6C). The 
7.5 ns simulation revealed that the structure is stable, lending 
further credence to the validity of this model, in which the 
GxxxG motif has been shown, for the first time, to promote 
trimerization. 

Possible Implications upon Viral Function. Considering 
the fact that the spike protein is responsible for attachment, 
entry, and antigenicity of the virus, the consequences of 
inserting an oligomerization domain must be discussed. 
Because of the health hazard that the SCoV poses, it is 
difficult, if not impossible, to determine directly the implica- 
tions of inserting the GxxxG oligomerization motif upon 
SCoV virulence. However, considering the results presented 
here and in other recent publications, we would like to 
conduct a theoretical discussion regarding the possible 
influence of the GxxxG motif upon viral virulence. 

FIGURE 6: Structural model obtained from MD simulations for the 
SARS spike protein TMD. (A and B) Structures obtained from in 
vacuo modeling in side and top view, respectively. The glycine 
residues forming the GxxxG motifs are shown in a ball-and-stick 
representation. (C) Hydrated lipid bilayer simulation system of the 
SCoV spike protein TMD. The protein is shown in a ribbon 
representation in blue, with the two glycine residues in a ball-and- 
stick representation. The lipid molecules are shown in bond format, 
with hydrocarbon tails in gray and polar atoms, N, O, and P, in 
blue, red, and purple, respectively. The solvent water molecules 
are represented in a ball-and-stick representation. This figure was 
generated by VMD (47) and rendered in PovRay (48). 

Oligomerization is key to the function of the spike protein 
(9, 10, 30); therefore, any modification in its nature, such as 
a change in the trimerization surface (i.e., contact regions) 
or in oligomerization stability, is likely to cause considerable 
changes in the function of the spike protein. In a recent 
publication, Corver and co-workers tested the role of the 
SCoV spike protein TMD during entry to target cells (9). 
The infectivity of SCoV spike proteins containing different 
TMDs was tested using SCoV spike protein pseudotyped 
viruses. Also, the cell-cell fusion activity of these proteins 
was tested using a luciferase-based cell-cell fusion assay. 

When the SCoV TMD and cytoplasmic tail were replaced 
by those originating from MHV, fusion activity was reduced 
by 40-50%. When the same domains were replaced 
by those originating from the vesicular stomatitis virus G 
(VSV-G) protein (which also contains the GxxxG motif), infectivity 
was less than 5% of the wild-type sequence and cell-cell 
fusion was just 10-30% of the wild-type sequence. In 
addition, the replacement of only the cytoplasmic domain 
with the VSV-G cytoplasmic domain had a mild effect, 
clearly indicating that the TMD is crucial for infectivity and 
fusion activity. Most importantly, the trimeric structure of 
the chimeric protein containing the TMD and the cytoplasmic 
domain of VSV-G was less stable than that of all of the other 
constructs. This result strengthens the assumption that 
trimerization is linked with higher fusion activity. 

Although there is no direct proof for the influence of the 
trimeric structure stability upon fusion activity, this possibility 
has already been considered (30). Moreover, we found a good 
correlation between the extent of oligomerization that we 
measured using the CAT assay and the fusion activity 
measured by Broer et al. (9). In our tests, the TMD of MHV 
was 70% oligomeric relative to the SCoV wild-type sequence, 
compared to 50-60% fusion activity measured by 
Corver and co-workers (9). 

It should be noted that the VSV-G TMD contains the 
GxxxG motif and that mutations of the glycines in the 
GxxxG motif of the VSV spike protein abolished viral 
infectivity and virulence (31). 

When these results are taken together, the influence of 
the oligomerization of the TMD of viral spike proteins upon 
viral virulence begins to be appreciated. Therefore, our 
finding that the SCoV spike protein contains an inserted 
active transmembrane trimerization motif may bear impact 
upon viral virulence, but this should be further investigated. 

Oligomeric Versatility of the GxxxG Motif. As stated 
above, all data pertaining to the GxxxG sequence signature 
has thus far implicated the motif in dimeric interactions (for 
example, see refs 16, 26, 27, and 32-35). Surprisingly, in 
the current study, we show that not only is a GxxxG motif 
present in a trimeric transmembrane helical bundle but that 
it is one of the driving forces of the trimerization reaction. 
To the best of our knowledge, this is the first demonstration 
of a sequence motif that drives transmembrane helical 
trimerization. Finally, it remains to be seen why some GxxxG 
motifs result in dimerization, while others contribute to 
trimeric assemblies. 